Tandem repeats in protein coding regions of primate genes.
نویسندگان
چکیده
Tandem repeats in GenBank primate nucleotide sequences annotated as protein coding regions are analyzed. It is found that only trinucleotide repeats show repeat enrichment well above the threshold of statistical significance. The statistics are improved by a simultaneous search for repeats on both the amino acid and nucleotide levels. The results of the analyses of natural sequences are interpreted by comparing them with the results of the computer simulation of the model dedicated to protein coding regions. According to the simulation results, a limited set of trinucleotides, that is, cgg, ccg, cag, and gaa repeats coding for polyalanine, polyglycine, polyproline, polyglutamine, and polylysine are prone to proliferation. It is also found that within the repeat regions slippage is more frequent by a factor of 10 than point mutations, whereas the ratio of silent versus recognizable point mutations is approximately the same as elsewhere in coding regions. The trinucleotide repeats cover slightly more than 0.3% of the protein coding regions of genes.
منابع مشابه
TRbase: a database relating tandem repeats to disease genes for the human genome
MOTIVATION Tandem repeats are associated with disease genes, play an important role in evolution and are important in genomic organization and function. Although much research has been done on short perfect patterns of repeats, there has been less focus on imperfect repeats. Thus, there is an acute need for a tandem repeats database that provides reliable and up to date information on both perf...
متن کاملP-157: Polymorphic Core Promoter GA-repeats Alter Gene Expression of The Early Embryonic Developmental Genes
Background: We examine the GA-repeat core promoters of MECOM and GABRA3 in human embryonic kidney-293 cell line and show that those GA-repeats have promoter activity,and those different alleles of the repeats can significantly alter gene expression.We propose a novel role for GA-repeat core promoters to regulate gene expression in the genes involved in development and evolution. Materials and M...
متن کاملCorrection: Complete Chloroplast Genome Sequence of Poisonous and Medicinal Plant Datura stramonium: Organizations and Implications for Genetic Engineering
Datura stramonium is a widely used poisonous plant with great medicinal and economic value. Its chloroplast (cp) genome is 155,871 bp in length with a typical quadripartite structure of the large (LSC, 86,302 bp) and small (SSC, 18,367 bp) single-copy regions, separated by a pair of inverted repeats (IRs, 25,601 bp). The genome contains 113 unique genes, including 80 protein-coding genes, 29 tR...
متن کاملA Dual Origin of the Xist Gene from a Protein-Coding Gene and a Set of Transposable Elements
X-chromosome inactivation, which occurs in female eutherian mammals is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx...
متن کاملCLONING AND EXPRESSION OF LEISHMANOLYSIN GENE FROM LEISHMANIA MAJOR IN PRIMATE CELL LINES
Leishmanolysin is a worldwide disease that is caused by different species of the genus Leishmania. Leishmanolysin, One of the genes expressed by Leishmania, appears to be an ideal candidate for genetic vaccination. In this study, a full length sequence, which encodes Leishmanolysin functionally critical regions (amino acids 100-579), was cloned from a Leishmania strain endemic to Iran. Analysis...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Genome research
دوره 12 6 شماره
صفحات -
تاریخ انتشار 2002